{
 "cells": [
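  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Feature selection: from all available features, pick the ones that are meaningful and help the model, so that not every feature has to be fed into training.\n",
    "#### Filter methods can be thought of as preprocessing that runs before any machine learning algorithm; the selection process is completely independent of the learning algorithm. Features are scored with statistical tests, and the relatively uninformative ones are filtered out based on those scores to improve the feature set.\n",
    "#### Goal of filter methods: reduce the algorithm's computational cost while maintaining its performance.\n",
    "#### Variance filtering: `VarianceThreshold` is a class that selects features by their own variance. A feature with very small variance means samples barely differ on it: most values may be identical, or the whole column may even be constant, so the feature does little to distinguish samples. Whatever feature engineering comes next, features with zero variance should therefore be removed first. Its key parameter `threshold` is the variance cutoff: every feature whose variance does not exceed `threshold` is dropped. The default is 0, which removes features that take the same value in every record."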
   ]
  },
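  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### A minimal sketch of variance filtering on a small made-up array. The toy data and the names `X_toy` and `toy_selector` are illustrative assumptions, not part of the dataset used in this notebook. The first column is constant, so with the default `threshold=0` it should be the only feature dropped."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.feature_selection import VarianceThreshold\n",
    "\n",
    "# Hypothetical toy data: column 0 is constant, columns 1 and 2 vary\n",
    "X_toy = np.array([[0, 2, 1],\n",
    "                  [0, 1, 4],\n",
    "                  [0, 3, 1],\n",
    "                  [0, 2, 5]])\n",
    "\n",
    "toy_selector = VarianceThreshold()            # threshold defaults to 0.0\n",
    "X_toy_reduced = toy_selector.fit_transform(X_toy)\n",
    "\n",
    "print(toy_selector.variances_)                # per-feature variance seen during fit\n",
    "print(toy_selector.get_support())             # boolean mask of the features that are kept\n",
    "print(X_toy_reduced.shape)                    # shape after dropping the filtered features"
   ]
  },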
  {
   "cell_type": "code",
   "execution_count": 246,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import seaborn as sns\n",
    "import matplotlib.pyplot as plt\n",
    "import sklearn\n",
    "from sklearn.feature_selection import VarianceThreshold\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.preprocessing import LabelEncoder\n",
    "from sklearn.metrics import accuracy_score  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 247,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the data\n",
    "df=pd.read_csv('StudentPerformance.csv')  # read the CSV into a DataFrame"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 248,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>gender</th>\n",
       "      <th>NationalITy</th>\n",
       "      <th>PlaceofBirth</th>\n",
       "      <th>StageID</th>\n",
       "      <th>GradeID</th>\n",
       "      <th>SectionID</th>\n",
       "      <th>Topic</th>\n",
       "      <th>Semester</th>\n",
       "      <th>Relation</th>\n",
       "      <th>raisedhands</th>\n",
       "      <th>VisITedResources</th>\n",
       "      <th>AnnouncementsView</th>\n",
       "      <th>Discussion</th>\n",
       "      <th>ParentAnsweringSurvey</th>\n",
       "      <th>ParentschoolSatisfaction</th>\n",
       "      <th>StudentAbsenceDays</th>\n",
       "      <th>Class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>M</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>lowerlevel</td>\n",
       "      <td>G-04</td>\n",
       "      <td>A</td>\n",
       "      <td>IT</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>15</td>\n",
       "      <td>16</td>\n",
       "      <td>2</td>\n",
       "      <td>20</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Good</td>\n",
       "      <td>Under-7</td>\n",
       "      <td>M</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>M</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>lowerlevel</td>\n",
       "      <td>G-04</td>\n",
       "      <td>A</td>\n",
       "      <td>IT</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>20</td>\n",
       "      <td>20</td>\n",
       "      <td>3</td>\n",
       "      <td>25</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Good</td>\n",
       "      <td>Under-7</td>\n",
       "      <td>M</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>M</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>lowerlevel</td>\n",
       "      <td>G-04</td>\n",
       "      <td>A</td>\n",
       "      <td>IT</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>10</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>30</td>\n",
       "      <td>No</td>\n",
       "      <td>Bad</td>\n",
       "      <td>Above-7</td>\n",
       "      <td>L</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>M</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>lowerlevel</td>\n",
       "      <td>G-04</td>\n",
       "      <td>A</td>\n",
       "      <td>IT</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>30</td>\n",
       "      <td>25</td>\n",
       "      <td>5</td>\n",
       "      <td>35</td>\n",
       "      <td>No</td>\n",
       "      <td>Bad</td>\n",
       "      <td>Above-7</td>\n",
       "      <td>L</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>M</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>lowerlevel</td>\n",
       "      <td>G-04</td>\n",
       "      <td>A</td>\n",
       "      <td>IT</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>40</td>\n",
       "      <td>50</td>\n",
       "      <td>12</td>\n",
       "      <td>50</td>\n",
       "      <td>No</td>\n",
       "      <td>Bad</td>\n",
       "      <td>Above-7</td>\n",
       "      <td>M</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>F</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>lowerlevel</td>\n",
       "      <td>G-04</td>\n",
       "      <td>A</td>\n",
       "      <td>IT</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>42</td>\n",
       "      <td>30</td>\n",
       "      <td>13</td>\n",
       "      <td>70</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Bad</td>\n",
       "      <td>Above-7</td>\n",
       "      <td>M</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>M</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>MiddleSchool</td>\n",
       "      <td>G-07</td>\n",
       "      <td>A</td>\n",
       "      <td>Math</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>35</td>\n",
       "      <td>12</td>\n",
       "      <td>0</td>\n",
       "      <td>17</td>\n",
       "      <td>No</td>\n",
       "      <td>Bad</td>\n",
       "      <td>Above-7</td>\n",
       "      <td>L</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>M</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>MiddleSchool</td>\n",
       "      <td>G-07</td>\n",
       "      <td>A</td>\n",
       "      <td>Math</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>50</td>\n",
       "      <td>10</td>\n",
       "      <td>15</td>\n",
       "      <td>22</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Good</td>\n",
       "      <td>Under-7</td>\n",
       "      <td>M</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>F</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>MiddleSchool</td>\n",
       "      <td>G-07</td>\n",
       "      <td>A</td>\n",
       "      <td>Math</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>12</td>\n",
       "      <td>21</td>\n",
       "      <td>16</td>\n",
       "      <td>50</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Good</td>\n",
       "      <td>Under-7</td>\n",
       "      <td>M</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>F</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>MiddleSchool</td>\n",
       "      <td>G-07</td>\n",
       "      <td>B</td>\n",
       "      <td>IT</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>70</td>\n",
       "      <td>80</td>\n",
       "      <td>25</td>\n",
       "      <td>70</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Good</td>\n",
       "      <td>Under-7</td>\n",
       "      <td>M</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  gender NationalITy PlaceofBirth       StageID GradeID SectionID Topic  \\\n",
       "0      M          KW       KuwaIT    lowerlevel    G-04         A    IT   \n",
       "1      M          KW       KuwaIT    lowerlevel    G-04         A    IT   \n",
       "2      M          KW       KuwaIT    lowerlevel    G-04         A    IT   \n",
       "3      M          KW       KuwaIT    lowerlevel    G-04         A    IT   \n",
       "4      M          KW       KuwaIT    lowerlevel    G-04         A    IT   \n",
       "5      F          KW       KuwaIT    lowerlevel    G-04         A    IT   \n",
       "6      M          KW       KuwaIT  MiddleSchool    G-07         A  Math   \n",
       "7      M          KW       KuwaIT  MiddleSchool    G-07         A  Math   \n",
       "8      F          KW       KuwaIT  MiddleSchool    G-07         A  Math   \n",
       "9      F          KW       KuwaIT  MiddleSchool    G-07         B    IT   \n",
       "\n",
       "  Semester Relation  raisedhands  VisITedResources  AnnouncementsView  \\\n",
       "0        F   Father           15                16                  2   \n",
       "1        F   Father           20                20                  3   \n",
       "2        F   Father           10                 7                  0   \n",
       "3        F   Father           30                25                  5   \n",
       "4        F   Father           40                50                 12   \n",
       "5        F   Father           42                30                 13   \n",
       "6        F   Father           35                12                  0   \n",
       "7        F   Father           50                10                 15   \n",
       "8        F   Father           12                21                 16   \n",
       "9        F   Father           70                80                 25   \n",
       "\n",
       "   Discussion ParentAnsweringSurvey ParentschoolSatisfaction  \\\n",
       "0          20                   Yes                     Good   \n",
       "1          25                   Yes                     Good   \n",
       "2          30                    No                      Bad   \n",
       "3          35                    No                      Bad   \n",
       "4          50                    No                      Bad   \n",
       "5          70                   Yes                      Bad   \n",
       "6          17                    No                      Bad   \n",
       "7          22                   Yes                     Good   \n",
       "8          50                   Yes                     Good   \n",
       "9          70                   Yes                     Good   \n",
       "\n",
       "  StudentAbsenceDays Class  \n",
       "0            Under-7     M  \n",
       "1            Under-7     M  \n",
       "2            Above-7     L  \n",
       "3            Above-7     L  \n",
       "4            Above-7     M  \n",
       "5            Above-7     M  \n",
       "6            Above-7     L  \n",
       "7            Under-7     M  \n",
       "8            Under-7     M  \n",
       "9            Under-7     M  "
      ]
     },
     "execution_count": 248,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head(10)          # show the first ten rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 249,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "gender                      object\n",
       "NationalITy                 object\n",
       "PlaceofBirth                object\n",
       "StageID                     object\n",
       "GradeID                     object\n",
       "SectionID                   object\n",
       "Topic                       object\n",
       "Semester                    object\n",
       "Relation                    object\n",
       "raisedhands                  int64\n",
       "VisITedResources             int64\n",
       "AnnouncementsView            int64\n",
       "Discussion                   int64\n",
       "ParentAnsweringSurvey       object\n",
       "ParentschoolSatisfaction    object\n",
       "StudentAbsenceDays          object\n",
       "Class                       object\n",
       "dtype: object"
      ]
     },
     "execution_count": 249,
     "metadata": {},
     "output_type": "execute_result"
    }
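   ],
   "source": [
    "df.dtypes     # check data types: numeric columns must be told apart from string (categorical) columns\n",
    "# Only four columns are numeric; the rest are strings and will need encoding during feature engineering"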
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 250,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Rename columns\n",
    "df.rename(index=str,columns={\n",
    "    'gender':'Gender',\n",
    "    'NationalITy':'Nationality',\n",
    "    'raisedhands':'RaisedHands',\n",
    "    'VisITedResources':'VisitedResources'\n",
    "},inplace=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 251,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Gender</th>\n",
       "      <th>Nationality</th>\n",
       "      <th>PlaceofBirth</th>\n",
       "      <th>StageID</th>\n",
       "      <th>GradeID</th>\n",
       "      <th>SectionID</th>\n",
       "      <th>Topic</th>\n",
       "      <th>Semester</th>\n",
       "      <th>Relation</th>\n",
       "      <th>RaisedHands</th>\n",
       "      <th>VisitedResources</th>\n",
       "      <th>AnnouncementsView</th>\n",
       "      <th>Discussion</th>\n",
       "      <th>ParentAnsweringSurvey</th>\n",
       "      <th>ParentschoolSatisfaction</th>\n",
       "      <th>StudentAbsenceDays</th>\n",
       "      <th>Class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "      <td>480.000000</td>\n",
       "      <td>480.000000</td>\n",
       "      <td>480.000000</td>\n",
       "      <td>480.000000</td>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "      <td>480</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>unique</th>\n",
       "      <td>2</td>\n",
       "      <td>14</td>\n",
       "      <td>14</td>\n",
       "      <td>3</td>\n",
       "      <td>10</td>\n",
       "      <td>3</td>\n",
       "      <td>12</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>top</th>\n",
       "      <td>M</td>\n",
       "      <td>KW</td>\n",
       "      <td>KuwaIT</td>\n",
       "      <td>MiddleSchool</td>\n",
       "      <td>G-02</td>\n",
       "      <td>A</td>\n",
       "      <td>IT</td>\n",
       "      <td>F</td>\n",
       "      <td>Father</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Good</td>\n",
       "      <td>Under-7</td>\n",
       "      <td>M</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>freq</th>\n",
       "      <td>305</td>\n",
       "      <td>179</td>\n",
       "      <td>180</td>\n",
       "      <td>248</td>\n",
       "      <td>147</td>\n",
       "      <td>283</td>\n",
       "      <td>95</td>\n",
       "      <td>245</td>\n",
       "      <td>283</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>270</td>\n",
       "      <td>292</td>\n",
       "      <td>289</td>\n",
       "      <td>211</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>46.775000</td>\n",
       "      <td>54.797917</td>\n",
       "      <td>37.918750</td>\n",
       "      <td>43.283333</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>30.779223</td>\n",
       "      <td>33.080007</td>\n",
       "      <td>26.611244</td>\n",
       "      <td>27.637735</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>15.750000</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>14.000000</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>50.000000</td>\n",
       "      <td>65.000000</td>\n",
       "      <td>33.000000</td>\n",
       "      <td>39.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>75.000000</td>\n",
       "      <td>84.000000</td>\n",
       "      <td>58.000000</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>100.000000</td>\n",
       "      <td>99.000000</td>\n",
       "      <td>98.000000</td>\n",
       "      <td>99.000000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       Gender Nationality PlaceofBirth       StageID GradeID SectionID Topic  \\\n",
       "count     480         480          480           480     480       480   480   \n",
       "unique      2          14           14             3      10         3    12   \n",
       "top         M          KW       KuwaIT  MiddleSchool    G-02         A    IT   \n",
       "freq      305         179          180           248     147       283    95   \n",
       "mean      NaN         NaN          NaN           NaN     NaN       NaN   NaN   \n",
       "std       NaN         NaN          NaN           NaN     NaN       NaN   NaN   \n",
       "min       NaN         NaN          NaN           NaN     NaN       NaN   NaN   \n",
       "25%       NaN         NaN          NaN           NaN     NaN       NaN   NaN   \n",
       "50%       NaN         NaN          NaN           NaN     NaN       NaN   NaN   \n",
       "75%       NaN         NaN          NaN           NaN     NaN       NaN   NaN   \n",
       "max       NaN         NaN          NaN           NaN     NaN       NaN   NaN   \n",
       "\n",
       "       Semester Relation  RaisedHands  VisitedResources  AnnouncementsView  \\\n",
       "count       480      480   480.000000        480.000000         480.000000   \n",
       "unique        2        2          NaN               NaN                NaN   \n",
       "top           F   Father          NaN               NaN                NaN   \n",
       "freq        245      283          NaN               NaN                NaN   \n",
       "mean        NaN      NaN    46.775000         54.797917          37.918750   \n",
       "std         NaN      NaN    30.779223         33.080007          26.611244   \n",
       "min         NaN      NaN     0.000000          0.000000           0.000000   \n",
       "25%         NaN      NaN    15.750000         20.000000          14.000000   \n",
       "50%         NaN      NaN    50.000000         65.000000          33.000000   \n",
       "75%         NaN      NaN    75.000000         84.000000          58.000000   \n",
       "max         NaN      NaN   100.000000         99.000000          98.000000   \n",
       "\n",
       "        Discussion ParentAnsweringSurvey ParentschoolSatisfaction  \\\n",
       "count   480.000000                   480                      480   \n",
       "unique         NaN                     2                        2   \n",
       "top            NaN                   Yes                     Good   \n",
       "freq           NaN                   270                      292   \n",
       "mean     43.283333                   NaN                      NaN   \n",
       "std      27.637735                   NaN                      NaN   \n",
       "min       1.000000                   NaN                      NaN   \n",
       "25%      20.000000                   NaN                      NaN   \n",
       "50%      39.000000                   NaN                      NaN   \n",
       "75%      70.000000                   NaN                      NaN   \n",
       "max      99.000000                   NaN                      NaN   \n",
       "\n",
       "       StudentAbsenceDays Class  \n",
       "count                 480   480  \n",
       "unique                  2     3  \n",
       "top               Under-7     M  \n",
       "freq                  289   211  \n",
       "mean                  NaN   NaN  \n",
       "std                   NaN   NaN  \n",
       "min                   NaN   NaN  \n",
       "25%                   NaN   NaN  \n",
       "50%                   NaN   NaN  \n",
       "75%                   NaN   NaN  \n",
       "max                   NaN   NaN  "
      ]
     },
     "execution_count": 251,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.describe(include='all')   # summary statistics for all columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 252,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Gender                      0\n",
       "Nationality                 0\n",
       "PlaceofBirth                0\n",
       "StageID                     0\n",
       "GradeID                     0\n",
       "SectionID                   0\n",
       "Topic                       0\n",
       "Semester                    0\n",
       "Relation                    0\n",
       "RaisedHands                 0\n",
       "VisitedResources            0\n",
       "AnnouncementsView           0\n",
       "Discussion                  0\n",
       "ParentAnsweringSurvey       0\n",
       "ParentschoolSatisfaction    0\n",
       "StudentAbsenceDays          0\n",
       "Class                       0\n",
       "dtype: int64"
      ]
     },
     "execution_count": 252,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Check every column for missing values; any that exist would need to be imputed or otherwise handled\n",
    "df.isnull().sum()        # count missing values per column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 253,
   "metadata": {},
   "outputs": [],
   "source": [
    "df=df.dropna()     # dropna() removes rows that contain missing values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 254,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Gender                      480\n",
       "Nationality                 480\n",
       "PlaceofBirth                480\n",
       "StageID                     480\n",
       "GradeID                     480\n",
       "SectionID                   480\n",
       "Topic                       480\n",
       "Semester                    480\n",
       "Relation                    480\n",
       "RaisedHands                 480\n",
       "VisitedResources            480\n",
       "AnnouncementsView           480\n",
       "Discussion                  480\n",
       "ParentAnsweringSurvey       480\n",
       "ParentschoolSatisfaction    480\n",
       "StudentAbsenceDays          480\n",
       "Class                       480\n",
       "dtype: int64"
      ]
     },
     "execution_count": 254,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.count()         # non-null count per column; all are 480, confirming there is no missing data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 255,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Gender                      object\n",
       "Nationality                 object\n",
       "PlaceofBirth                object\n",
       "StageID                     object\n",
       "GradeID                     object\n",
       "SectionID                   object\n",
       "Topic                       object\n",
       "Semester                    object\n",
       "Relation                    object\n",
       "RaisedHands                  int64\n",
       "VisitedResources             int64\n",
       "AnnouncementsView            int64\n",
       "Discussion                   int64\n",
       "ParentAnsweringSurvey       object\n",
       "ParentschoolSatisfaction    object\n",
       "StudentAbsenceDays          object\n",
       "Class                       object\n",
       "dtype: object"
      ]
     },
     "execution_count": 255,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.dtypes   # check data types"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 256,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>RaisedHands</th>\n",
       "      <th>VisitedResources</th>\n",
       "      <th>AnnouncementsView</th>\n",
       "      <th>Discussion</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>RaisedHands</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.691572</td>\n",
       "      <td>0.643918</td>\n",
       "      <td>0.339386</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>VisitedResources</th>\n",
       "      <td>0.691572</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.594500</td>\n",
       "      <td>0.243292</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AnnouncementsView</th>\n",
       "      <td>0.643918</td>\n",
       "      <td>0.594500</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.417290</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Discussion</th>\n",
       "      <td>0.339386</td>\n",
       "      <td>0.243292</td>\n",
       "      <td>0.417290</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   RaisedHands  VisitedResources  AnnouncementsView  \\\n",
       "RaisedHands           1.000000          0.691572           0.643918   \n",
       "VisitedResources      0.691572          1.000000           0.594500   \n",
       "AnnouncementsView     0.643918          0.594500           1.000000   \n",
       "Discussion            0.339386          0.243292           0.417290   \n",
       "\n",
       "                   Discussion  \n",
       "RaisedHands          0.339386  \n",
       "VisitedResources     0.243292  \n",
       "AnnouncementsView    0.417290  \n",
       "Discussion           1.000000  "
      ]
     },
     "execution_count": 256,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#相关性矩阵\n",
    "corrDF=df[['RaisedHands','VisitedResources','AnnouncementsView','Discussion']].corr()\n",
    "corrDF  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 257,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<AxesSubplot:>"
      ]
     },
     "execution_count": 257,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAbwAAAFbCAYAAACu8TvlAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAABF00lEQVR4nO3dd5xU1fnH8c93Z+llWdqCgIKAYKGKqCgWLLHErlFSNCYGjT1K8ovlZ429B40E/aFYScQSYsNYKHZAkBZQpCgivcOysLvP7497d5lddmGRmbkL93n7mhdz7z1z5plxdp455557jswM55xzbneXFXUAzjnnXCZ4wnPOORcLnvCcc87Fgic855xzseAJzznnXCx4wnPOORcLnvCcc85VO5KGSloiaVolxyXpr5JmS5oiqef26vSE55xzrjp6GjhhG8dPBDqGtwHA49ur0BOec865asfMxgIrtlHkNOAZC3wKNJLUclt1ZqcyQFe91Nmzv0+jE9rzpt9HHUK10aSx/84t8frJ2/o+jZfGtU7VztaxI985G78bfjFBy6zEEDMbsgNP1wr4Lml7Qbjvh8oe4AnPOedcSkhV/zEVJrcdSXBbPV1F1W7rAZ7wnHPOpYQye5ZsAdAmabs1sHBbD/C+DeeccykhZVX5lgIjgfPD0ZqHAKvNrNLuTPAWnnPOuRRJUSIL69KLwFFAU0kLgJuBGgBmNhh4EzgJmA1sAC7cXp2e8JxzzqWElEhZXWbWfzvHDbhsR+r0hOeccy4lUtnCSwdPeM4551LCE55zzrlYyPAozR3mCc8551xKeAvPOedcLHjCc845FwtZKRylmQ6e8JxzzqWEt/Ccc87Fgic855xzseAJzznnXEx4wnPOORcDWVnVO6VU7+icc87tMvzCc+ecc7Hg5/Ccc87FglTRIuTVhyc855xzKeEtPOecc7Hg5/Ccc87Fgo/STDNJRcBUgtcyF/iVma3aRvlewPlmduVOPu8twDozu1/S08DrZjYi6fg6M6u/M88R1tM2rPuAna0r0wbfdzEnHtODpcvX0Ou4P0UdTlr1bZ3LDYe0J0vipVmLeGLKd1uV6d0yh+sPaU92lli5cTO/emMKAOfvvwfndGqJBC/NXMSw6d9nOvyUOjivEVd33ZuExL/nLebZrxZsVaZH0xyu6tqO7CyxuqCQy8ZNLT2WBQzt152l+Zv44yczMhh56n3y4UwevmckRcXFnHpmb87/bb8yx8d+MI0hj44iK0skEgmu/tOpdOvZrvR4UVExF/Z/hGbNc3jg0d9kOvwd5i289Ms3s+4AkoYRLPl+R2WFzWwCMCEzocXbsy+NYfCwUTz50KVRh5JWWYKb+nTgwremsnh9ASNO68H73y7nm1UbSss0qJng5j4duOjtafywvoDGtWsA0DG3Lud0ask5/5rE5uJinjyhC6O/W878NRujejk7JQsY2K09V304jSX5m/i/o7sz7oflzFubX1qmfo0EA7u355qPprM4v4DcWjXK1PGzDnswb+0G6mXv2l9PRUXFPHDnqzwyZADN83L4Tf+/0veo/WnXPq+0TK+DO9L3qP2RxOyvFnLDwOf4x8gtPw7/+fw42rZrzvr1BVG8hB1Xzc/hVe/odtwnQCsASb0lfSxpUvhvp3D/UZJeD+8fKWlyeJskqUG4/4+SxkuaIunWksol3SBplqR3gU5VCUhSfUnvSfpC0lRJp4X720r6r6QnJE2X9I6kOuGxAyV9KekTggReUtf+kj4P450iqWNK3rU0+ejzmaxYtS7qMNKua7MGzF+Tz4K1G9lcbLwxZynH7NWkTJlT2jfnP/OW80P4xbVi42YA2jeqy5dL17CxqJgig/E/rOa4tk0z/hpSZb/GDViwfiMLNxRQaMa7C5bSt2XZ9+L4Ns0Ys3AZi/OD92JlwebSY83q1KRPi8b8e97ijMadDjOmfUvrPZvSqnUTatTI5tgTujP2g+llytStW6t0
ZGN+/qYyoxyXLFrFR2NncuqZB2c07p0hZVX5FoXdJuFJSgDHACPDXTOBI8ysB3ATcGcFDxsIXBa2EPsC+ZKOBzoCvYHuwIGSjpB0IHAe0AM4EzioXF33JSXPyUn7NwJnmFlP4GjgAW35VHcEHjOz/YFVwFnh/qeAK83s0HLPcQnwSBhvL2DrviKXcXl1a7Eo6Rf44vUF5NWtWaZM25w6NKyVzTMnd+Xl03twWofmAHy1cj29WuTQqFY2tRNZHNGmMS3q1cpo/KnUrHbN0kQGsDS/gGZ1yr4XberXoUGNbB7t24WhR3fnhD2blx67uuvePDZtLsWWsZDTZuniNTTPa1S63Twvh6VLVm9VbvR7Uzn31Hu59rKh3HDbOaX7H753JJdfczJZWdV7qH8ySVW+VaGuE8IGxmxJf67geK6kV8Mf/59L2u5pn127zyBQJ0wwbYGJwH/C/TnAsLAVZECNCh77EfCgpOeBV8xsQZjwjgcmhWXqEySmBsCrZrYBQNLIcnX9sfw5vJK7wJ2SjgCKCVqgJX0ac81scnh/ItBWUg7QyMzGhPufBU4M738C3CCpdRjv19t7c1z6VfS3W/77OiGxf9P6/PrNKdROZDH81B58uWQtc1bl8+SXCxh6Yhc2bC5m1op1FO3K3/ZVfC865dbnynHTqJXIYshR3Zi+Yg1t6tdhZcFmZq1aT4+mORkJN51sq1de8XVqRx3ThaOO6cKkCXMY8ugoBj1xMR+OmUFu4/p03q81X4z/JhPhpkSqzuGFDZjHgOMIftiPlzTSzJJP6l4PTDazMyR1Dssfs616d4cWXsk5vL2AmmzpArwd+CAc7HEKULv8A83sbuAioA7wafimCbjLzLqHtw5m9n8lD/kR8f0CaAYcGMa5OCmW5I75IoIfIKrseczsBeBUIB8YJalf+TKSBkiaIGlC4brZPyJct6MWrS8o0yrLq1eLJRs2lSuziXELVpJfWMzKgkImLFpN5yb1ABjx1SLOfG0Sv3zjS1YVFDJ/TT67qqX5m8irs+W9aFanFsvyN21V5rPFq9hYVMzqTYVMXraaDjn16NqkIYe3bMzLP+nFbb07cWCzHG7utU+mX0LKNM/LYcniVaXbSxavpmmzhpWW79Frb77/bjmrVq5nyuR5jBs9gzNOuJP//dNzTPx8Nrdc90IGot45ykpU+bYdvYHZZjbHzDYBw4HTypXZD3gPwMxmEjQY8tiG3SHhAWBmq4ErgYGSahC08EqGu/26osdIam9mU83sHoKBLJ2BUcBvJNUPy7SS1BwYC5whqU54ru+UKoaWAywxs82SjiZIzNt6HauA1ZIOD3f9IinevYE5ZvZXgq7brhU8foiZ9TKzXtn1O1QxRLczpi5dS9uGdWhdvzY1ssTJezfj/fnLy5R579tl9MprSEJQO5FF12YNSge1lAxgaVmvFse3bcrr3yzN+GtIlf+uXEvr+nVoWbcW2RLHtm7Ghz+sKFNm7A/L6dYkeC9qJbLYP7cB89fmM3j6fE5/azxnjZrATZ/PYuLS1dw64auIXsnO23f/Nnw3fxkLF6xg8+ZC3n17Mn2P2q9Mme++XYZZ8Pt21owFbC4sIqdRXS696iRGvnsjr759Pbff+0sO7N2BW+76eRQvY8dkVf2W/OM8vA1IqqkVkDzUeUG4L9mXBKeXkNSb4Lu19bbC2x26NEuZ2SRJXxKca7uXoEvzGuD9Sh5ydZiEioAZwFtmViBpX+CTsPthHfBLM/tC0j+AycB8YFwVw3oe+LekCeFjZ1bhMRcCQyVtIEjAJc4FfilpM7AIuK2KMURi2KAr6HvovjTNbcDszx7l9gdHMOwfo6MOK+WKDG77eDZPnngACYmXv1rE7FUbOK9zSwCGz/yBOavyGbdgJSPPPJBigxGzFvH1yiDhDTp2PxrVyqaw2Lj149ms2VQY5cvZKUUGD07+hocOO4CE4PX5i5m7dgOnt2sBwGtzFzF/bT6fLl7JM8f0xMwYOW8xc9Zs
2E7Nu57s7ATXXn86V//+CYqLivnp6b3Zu0MLXvnnJwCc+bNDGf3uVN7690Sys7OoVasGf7n3l9V+eq5t2oHYzWwIMKSymip6SLntu4FHwlNaUwlOQ23zj0clvy7c7qfOnv39f25oz5t+H3UI1UaTxrtNx85Oe/3kFdsvFBONa52605l2nz6Dq/yd89XHl1T6fJIOBW4xs5+E29cBmNldlZQXwXXYXc1sTWX1+iffOedcauxAl+Z2jAc6SmonqSZBr12ZgYKSGoXHIBiLMXZbyQ52sy5N55xz0bEUXUJhZoWSLic4pZMAhprZdEmXhMcHA/sCzyiYbWsG8Nvt1esJzznnXGqk8JpBM3sTeLPcvsFJ9z8huGSsyjzhOeecS41qPuDGE55zzrnUqN75zhOec865FKnm06B5wnPOOZca3qXpnHMuFhKe8JxzzsVB9c53nvCcc86lhnmXpnPOuVjwQSvOOedioXrnO094zjnnUsS7NJ1zzsWCj9J0zjkXC97Cc845Fwue8JxzzsVCNV9h1ROec8651PAWnnPOuTgwH7TinHMuFryF55xzLhaqd77zhOeccy5FfGox55xzseBdmi4qe970+6hDqDa+ve3xqEOoNro87Z+LEo1rdY46hN1LCvOdpBOAR4AE8KSZ3V3ueA7wHLAnQS6738ye2ladnvCcc86lRnZqLsSTlAAeA44DFgDjJY00sxlJxS4DZpjZKZKaAbMkPW9mmyqrt5pfJuicc25XYar6bTt6A7PNbE6YwIYDp5V/OqCBJAH1gRVA4bYq9YTnnHMuNbJU5ZukAZImJN0GJNXUCvguaXtBuC/Zo8C+wEJgKnCVmRVvKzzv0nTOOZcaOzBoxcyGAEMqq6mih5Tb/gkwGegHtAf+I2mcma2p7Dm9heeccy41dqCFtx0LgDZJ260JWnLJLgRescBsYC6wzVFInvCcc86lRtYO3LZtPNBRUjtJNYHzgJHlynwLHAMgKQ/oBMzZVqXepemccy41EqlpQ5lZoaTLgVEElyUMNbPpki4Jjw8GbgeeljSVoAv0f8xs2bbq9YTnnHMuJSyFF56b2ZvAm+X2DU66vxA4fkfq9ITnnHMuNar5STJPeM4551LD59J0zjkXCz6XpnPOuVjwBWCdc87FgXmXpnPOuVjwhOeccy4W/Byec865WPDLEpxzzsWCt/Ccc87FQooWgE0XT3jOOedSIpVTi6WDJzznnHOpUb0beJ7wnHPOpYi38JxzzsWCX4e3haTRwF1mNipp39XAlcAQM7u7ksf1As43syslHQVsMrOPd/C55wG9zGyZpCJgKsHrnwv8ysxW7ejribu+rXO54ZD2ZEm8NGsRT0z5bqsyvVvmcP0h7cnOEis3buZXb0wB4Pz99+CcTi2R4KWZixg2/ftMh59Rg++7mBOP6cHS5Wvoddyfog4nrdZOn8bCfw4HKyb3sL40/8mJZY6v+2oW8x9/jJpNmwDQsHtP8k4+BYBl77/Lig/HAUbjw46g6THHZjr8lBo7diJ33PEExcXFnHPOcQwYcE6Z4yNHjuaJJ14GoF692txyy6V07tyu9HhRURFnnXUNeXmN+fvfb85o7D+KJ7wyXiRYuXZU0r7zgAvMbFxlDzKzCcCEcPMoYB2wQwmvnHwz6w4gaRhwGXDHTtS3wyQlzKwok8+ZSlmCm/p04MK3prJ4fQEjTuvB+98u55tVG0rLNKiZ4OY+Hbjo7Wn8sL6AxrVrANAxty7ndGrJOf+axObiYp48oQujv1vO/DUbo3o5affsS2MYPGwUTz50adShpJUVF7Nw+Au0u/IPZOfm8s3dd9Cwazdqt9yjTLl6HTrQ9rIry+zb+P33rPhwHB3+fD1KZDN30CM06NKFWs3zMvkSUqaoqIjbbhvMU0/dTl5eE84++xr69TuYDh32LC3TunUezz13Fzk59RkzZgL/+7+P8tJLD5Qef+aZf9O+fWvWrdtQ0VNUO1bN59LM9CnGEcBPJdUCkNQW
2APoIOnRcN85kqZJ+lLS2HDfUZJeD8tfAvxB0mRJfSU1k/SypPHh7bDwMU0kvSNpkqS/E6yIW5FPgFbhY9pLelvSREnjJHXeRky1JT0laWr4HEeH+39d8lrC7dfDVimS1km6TdJnwKGSzpc0Jaz32bBMZa/nyPA1Tw6fr0FK/o/8SF2bNWD+mnwWrN3I5mLjjTlLOWavJmXKnNK+Of+Zt5wf1hcAsGLjZgDaN6rLl0vXsLGomCKD8T+s5ri2TTP+GjLpo89nsmLVuqjDSLsN8+ZSs1kzajZrRlZ2Njm9DmLNl5Or9NiCRT9Qt93eZNWshRIJ6u2zD2smT0pvwGk0ZcrX7LVXS9q0aUHNmjU4+eQjeO+9z8qU6dlzX3Jy6gPQvXtnFi3asmD3okXLGD16PGefvUNrnEZLqvotAhlNeGa2HPgcOCHcdR7wD8CSit0E/MTMugGnlnv8PGAw8JCZdQ9bhY+E2wcBZwFPhsVvBj40sx7ASGBPypGUAI4JjwMMAa4wswOBgcDfthHTZWFMXYD+wDBJtbfzFtQDppnZwcBK4AagX1jvVWGZyl7PQOCysGXaF8jfznOlVV7dWiwKExnA4vUF5NWtWaZM25w6NKyVzTMnd+Xl03twWofmAHy1cj29WuTQqFY2tRNZHNGmMS3q1cpo/C49CletokZu49LtGrm5bF61aqtyG+bO4eu/3MrcQY+wcWHQnV1rj1asn/0VhevWUbypgLXTprJ55YpMhZ5yixcvp0WLLT/k8vKasHjx8krLjxjxDkcccWDp9p13PsEf/3ghWVnVfOhjsixV/RaBKAatlHRr/iv89zdA16TjHwFPS/on8EoV6jsW2E9bfjE0DFs/RwBnApjZG5JWJj2mjqTJQFtgIvAfSfWBPsBLSXWVfAtXFNPhwKCw/pmS5gP7bCfWIuDl8H4/YISZLQvrKPnLruz1fAQ8KOl54BUzW1DRE0gaAAwAaP6ra2l0xKkVFdtpFf1As3LbCYn9m9bn129OoXYii+Gn9uDLJWuZsyqfJ79cwNATu7BhczGzVqyjqLj8o90uybb+/1j+s1KnzZ50+svdJGrXZs20qcwf/Dc63XYHtVu2pNnxJzD3rw+RVasWdVq3hqxEhgJPPavwvaj4i/7TT6cwYsR/eOGFewD44IPPadw4hwMO6MBnn01Na5wpVb17NCNJeK8RfHH3BOqY2ReSShOemV0i6WDgZGCypO7bqS8LONTMyrR4wg9WZd+i+WbWXVIO8DpBa+1pYFXJub1klcRU2f/aQsq2nJNbfRuTztupkvgqfD3A3ZLeAE4CPpV0rJnNrCDWIQQtVTo9OTZtWWTR+oIyrbK8erVYsmFTuTKbWFmwkvzCYvILi5mwaDWdm9Rj3pp8Rny1iBFfLQLgD73asjipteh2Xdm5uWVaZZtXriQ7p1GZMok6dUrvNzygCwtffJ7CdWvJrt+Axof1pfFhfQFY9Nor1MjNzUjc6dCiRdMyXZSLFy+nefPGW5WbOXMuN944iCeeuIXc3IYAfPHFf3n//c8ZO3YiBQWbWLduAwMHPsD991+bsfh/jFQ2RiWdQNDjlQCeLD+oUdIfgV+Em9nAvkCzpMbD1vGlLryqMbN1wGhgKEFrrwxJ7c3sMzO7CVgGtClXZC2QfP7qHeDypMd3D++OJXwzJJ0IbPWXY2arCUaIDiToIpwr6ZzwMZLUbRsxJde/D0GX6SxgHtBdUpakNkDvSt6K94CfSWoS1lHyl1Dh6wljmGpm9xAM4OlcSb0ZMXXpWto2rEPr+rWpkSVO3rsZ788v213z3rfL6JXXkISgdiKLrs0alA5qKRnA0rJeLY5v25TXv1ma8dfgUq/uXm0pWLKETcuWUlxYyOoJ42nYtVuZMptXry5t/WyYNxfMSNQLzmMVrlkDwKYVy1kzeRKNelX251P9denSkXnzFvLdd4vYtGkzb7wxln79yr6ehQuXcMUVd3HvvdfQrl2r0v3XXnsBY8c+zfvv
/x8PPvgnDjmka7VPdhAkvKretiU83fQYcCKwH9Bf0n7JZczsvvDUVnfgOmDMtpIdRHcd3osEXYPnVXDsPkkdCVpA7wFfAkcmHf83MELSacAVBAnrMUlTCF7PWIKBLbcCL0r6AhgDfFtRIGY2SdKXYSy/AB6XdCNQAxgePn9FMc0EBkuaStCq+7WZFUj6iOBSh6nANOCLSp53uqQ7gDHhZRKTgF9v4/VcHQ6MKQJmAG9VVG+mFBnc9vFsnjzxABISL3+1iNmrNnBe55YADJ/5A3NW5TNuwUpGnnkgxQYjZi3i65VBwht07H40qpVNYbFx68ezWbOpMMqXk3bDBl1B30P3pWluA2Z/9ii3PziCYf8YHXVYKadEgj3O+zlzBz0MxUZun8OovUcrlo8dDUCTI45i9aSJrBg7GmUlUI0atPnt70q7+uYPeZyi9etL60nUqxfdi9lJ2dkJbrrpEi666GaKioo566xj6dhxL158MfjT7d//RB57bDirVq3h1lsfByCRSPDKKw9FGfZOqazL9kfoDcw2szlhvcOB0wi++yrSnwoaUFvFV1E/s9s9pLNLc1fz7W2PRx1CtXHy07+POoRqY0S/FlGHUI3ss9PZqsPgqn/nzL7kiEqfT9LZwAlmdlG4/SvgYDO7vIKydYEFQIfttfB2oeE/zjnnqrMduSpB0gBJE5JuA5KrqqD6ypLpKcBH20t24FOLOeecSxHtQBMqeYBdBRZQdvxGa2BhJWXPowrdmeAtPOeccymSwuvOxwMdJbWTVJMgqY0sXygcaX8kwWVu2+UtPOeccymRSFETyswKJV1OMA1lAhgaDvS7JDw+OCx6BvCOma2vSr2e8JxzzqVEKmcMM7M3gTfL7Rtcbvtpgmuoq8QTnnPOuZRI4WUJaeEJzznnXErsyKCVKHjCc845lxLVvIHnCc8551xqVPeFHTzhOeecS4lqvuC5JzznnHOp4V2azjnnYsETnnPOuVhQNe/T9ITnnHMuJbyF55xzLhZ8lKZzzrlYqOY9mp7wnHPOpYZ3aTrnnIsFn1rMOedcLHgLzznnXCz4agnOOediwUdpOueci4Vq3sDzhLc7a9K4mv/cyqAuT/8+6hCqjTd+/XjUIVQbkyb9IuoQqo0eTfbZ6Tr8sgTnnHOx4AnPOedcLGTJog5hm7zPyznnXEpkq+q37ZF0gqRZkmZL+nMlZY6SNFnSdEljthvfjr8k55xzbmupauFJSgCPAccBC4Dxkkaa2YykMo2AvwEnmNm3kppvN76UROeccy72slT123b0Bmab2Rwz2wQMB04rV+bnwCtm9i2AmS3Zbnw7/pKcc865rWXtwE3SAEkTkm4DkqpqBXyXtL0g3JdsHyBX0mhJEyWdv734vEvTOedcSuzIKE0zGwIMqeRwRTWV7y/NBg4EjgHqAJ9I+tTMvqrsOT3hOeecSwmlbpTmAqBN0nZrYGEFZZaZ2XpgvaSxQDeg0oTnXZrOOedSIoWjNMcDHSW1k1QTOA8YWa7Mv4C+krIl1QUOBv67zfh+3MtyzjnnykrVKE0zK5R0OTAKSABDzWy6pEvC44PN7L+S3gamAMXAk2Y2bVv1esJzzjmXEqmcacXM3gTeLLdvcLnt+4D7qlqnJzznnHMpUd3PkXnCc845lxI+l6ZzzrlYqO5zaXrCc845lxJVmSMzSp7wnHPOpYS38JxzzsWCn8NzzjkXC57wnHPOxYJfluCccy4WsrP8HJ5zzrkY8Baec865WNgtzuFJOgN4BdjXzGamN6Rdh6TrzezObRy/BahlZtcl7esOvEiwhtNfzezsdMeZLgfnNeLqrnuTkPj3vMU8+9WCrcr0aJrDVV3bkZ0lVhcUctm4qaXHsoCh/bqzNH8Tf/xkRgYjT72106ex8J/DwYrJPawvzX9yYpnj676axfzHH6Nm0yYANOzek7yTTwFg2fvvsuLDcYDR+LAjaHrMsZkOP6MG33cxJx7Tg6XL19DruD9FHU5aTf50JsMe
fo3iomL6nXIwp51/TIXlvpnxLTcO+CtX3fYrDunXjWWLV/K3219k1fK1ZGWJfqcewknnHpHh6HdcCpcHSouqtvD6Ax8SLNFwS9qi2fVcD1Sa8AgS21vAdUn7zgNeMLOFwC6b7LKAgd3ac9WH01iSv4n/O7o7435Yzry1+aVl6tdIMLB7e675aDqL8wvIrVWjTB0/67AH89ZuoF72rt3RYMXFLBz+Au2u/APZubl8c/cdNOzajdot9yhTrl6HDrS97Moy+zZ+/z0rPhxHhz9fjxLZzB30CA26dKFW87xMvoSMevalMQweNoonH7o06lDSqriomKH3v8INj1xMk+Y5XP/bhzmw7/60btdiq3Iv/O0Nuh3cqXRfIpHgV1ecSrtOrclfv5HrfvMQXXvvs9Vjq5vq3sLbbperpPrAYcBvCb6skXRUuKz6CEkzJT0vSeGxeZJulfSFpKmSOof7G0t6TdIUSZ9K6hruv0XSwKTnmyapbXj7r6QnJE2X9I6kOmGZDpLelfRl+Dztw/1/lDQ+fI5bw31twxifDOt+XtKxkj6S9LWk3mG5epKGho+fJOm0cP+vJb0i6e2w/L3h/ruBOpImh3XWk/RGGNM0Seea2SxglaSDk97SnwHDw7imhXUlJN2XFPvF4f6/STo1vP+qpKHh/d9K+suP+P+dMvs1bsCC9RtZuKGAQjPeXbCUvi2blClzfJtmjFm4jMX5BQCsLNhceqxZnZr0adGYf89bnNG402HDvLnUbNaMms2akZWdTU6vg1jz5eQqPbZg0Q/Ubbc3WTVroUSCevvsw5rJk9IbcMQ++nwmK1atizqMtJs941tatG5CXqsmZNfIps+xPZgwbvpW5d4e8SG9j+5Cw9z6pftymzakXafWANSpV5tWe+WxYunqjMX+Y2XtwC2q+LbndODtcNn0FZJ6hvt7AFcD+wF7EyTFEsvMrCfwOFCSzG4FJplZV4KW0TNVeO6OwGNmtj+wCjgr3P98uL8b0Af4QdLxYfneQHfgQEklfQAdgEeArkBn4OfA4WFs14dlbgDeN7ODgKOB+yTVC491B84FugDnSmpjZn8G8s2su5n9AjgBWGhm3czsAODt8LEvsuWHwiHAcjP7utzr/C2wOnzug4DfSWoHjAX6hmVaEbzXhLGPq8L7lzbNatcsTWQAS/MLaFanZpkyberXoUGNbB7t24WhR3fnhD2blx67uuvePDZtLsXVuwekSgpXraJGbuPS7Rq5uWxetWqrchvmzuHrv9zK3EGPsHHh9wDU2qMV62d/ReG6dRRvKmDttKlsXrkiU6G7NFqxdDVN8hqVbjdulrNV0lqxdDXjx0zluNP7VFrPkh9WMO/r7+mw/17pCjVlsrOsyrcoVCXh9QeGh/eHh9sAn5vZAjMrBiYDbZMe80r478Sk/YcDzwKY2ftAE0k523nuuWY2ObkuSQ2AVmb2aljXRjPbABwf3iYBXxAkto5J9UwNY50OvGdmBkxNiu944M+SJgOjgdrAnuGx98xstZltBGYAFX3ypgLHSrpHUl8zK/lkDwfOlpRFkPherOCxxwPnh8/9GdAkjH0cwYq++4XPu1hSS+BQ4OOK3jBJAyRNkDRh8TvlFwhOoQq6Lsp/hBMSnXLrM/Dj6fzho2lc2LkNberXpk+LXFYWbGbWqvXpiy+TbOs/XpV7f+q02ZNOf7mbjjfeTJOj+zF/8N8AqN2yJc2OP4G5f32IuYMeoU7r1pCVyETULgLlPxfDHn6Nn1/6U7ISFX8Vb9xQwEPXD+OCq06jbr3aGYhw52Sp6rcobPPkiaQmQD/gAAVnIxME32tvAgVJRYvK1VVQwf6KXqIBhZRNvMn/V8s/R51K6imp/y4z+3u519C2XD3FSdvF5eI7K+yGTH78wRXEsdX7ZmZfSToQOAm4S9I7ZnabmX0naR5wJEEL9dBKYr/CzEZtdUDKJWg9jgUaE3SJrjOztRXUg5kNAYYA9Hnlw7T9jFqav4m8OrVKt5vV
qcWy/E1blVm9aRUbi4rZWFTM5GWr6ZBTj06N6nN4y8YcmpdLzUQW9bIT3NxrH26d8FW6wk2r7NzcMq2yzStXkp3TqEyZRJ06pfcbHtCFhS8+T+G6tWTXb0Djw/rS+LCgIb/otVeokZubkbhdejVulsPyxatKt1csXU1u07K/8efMXMAjNz0LwNrV65n88UwSiSwOOrILhYVFPHj90xx+fE96H9U1k6H/aNX9p9r2WnhnA8+Y2V5m1tbM2gBzCVprO2os8AsIzgESdHuuAeYBPcP9PYF226okfMwCSaeHj6klqS7BUvC/Cc85IqmVpOaV17SVUcAVSecie1ThMZsl1QjL7wFsMLPngPtLXlPoReAh4Bsz23ooY/Dcv0+qa5+k7tRPCLqOxxK0+AYScXcmwH9XrqV1/Tq0rFuLbIljWzfjwx/KdsWN/WE53Zo0JCGolchi/9wGzF+bz+Dp8zn9rfGcNWoCN30+i4lLV++yyQ6g7l5tKViyhE3LllJcWMjqCeNp2LVbmTKbV6/GwpbghnlzwYxEveCcTeGaNQBsWrGcNZMn0ahX78y+AJcW7fdtw6IFy1iycDmFmwv5+N1JHHj4/mXKDHr5Bh595UYefeVGDj66K78ZeCYHHdkFM+Pvd/6DVm3zOLn/kRG9gh2XJavyLQrbGx7XH7i73L6Xgd8D3+zgc90CPCVpCrABuCCpvpLuvPFAVb75fgX8XdJtwGbgHDN7R9K+wCdhzloH/JKgRVYVtwMPA1PCpDcP+Ol2HjMkLP8FwTnJ+yQVhzH9PqncSwTnEK+opJ4nCbpWvwifeynBuVMIktvxZjZb0nyCVl7kCa/I4MHJ3/DQYQeQELw+fzFz127g9HAU2WtzFzF/bT6fLl7JM8f0xMwYOW8xc9ZsiDjy1FMiwR7n/Zy5gx6GYiO3z2HU3qMVy8eOBqDJEUexetJEVowdjbISqEYN2vz2d4SfU+YPeZyi9etL60nUq1f5k+0Ghg26gr6H7kvT3AbM/uxRbn9wBMP+MTrqsFIukZ3gwmvO5M4/DKG4yDj6p71ps3cL/vNqcDbiuDMqP283a8pcxr09kT3bt+R/LngAgPMuPokeffbNSOw/VnUfpSmr4PyD2z2ks0tzV7NHo+KoQ6g23vj141GHUG18POkXUYdQbfRo8tOdTlf3TvlPlb9z/tT1uIynx+o+E4xzzrldRI2sqt+2R9IJkmZJmi3pzxUcP0rS6vDSsMmSbtpenbv2Fb/OOeeqjVSdm5OUAB4DjgMWAOMljTSz8lMyjTOz7Z162hJfSqJzzjkXeym8LKE3MNvM5pjZJoLLu07b6fh2tgLnnHMOgssSqnpLvmY4vA1IqqoV8F3S9oJwX3mHhrNbvSVp/wqOl+Fdms4551JiR0ZpJl8zXIHKrttO9gWwl5mtk3QS8BpbJhupOL6qh+ecc85VrkaWVfm2HQuANknbrYGFyQXMbI2ZrQvvvwnUkNR0W5V6wnPOOZcSKTyHNx7oKKmdpJoE0zKWmStRUoukiUJ6E+Sz5duq1Ls0nXPOpUSqLjw3s0JJlxPMQpUAhprZdEmXhMcHE8wE9ntJhUA+cJ5t58JyT3jOOedSIpUzrYTdlG+W2zc46f6jwKM7UqcnPOeccymR2E1WPHfOOee2qboPCvGE55xzLiWyq3nG84TnnHMuJbxL0znnXCxU9+WBPOE555xLCU94zjnnYsETnnPOuViowpRhkfKE55xzLiWq+SBNT3jOOedSw7s0nXPOxULCE55zzrk4yPLr8JxzzsWBd2m6yLx+8oqoQ6g2GtfqHHUI1cakSb+IOoRqo0+P56MOodrI//anO11Htic855xzcSBPeM455+Kgmuc7T3jOOedSw1t4zjnnYsEvPHfOORcL8ssSnHPOxUF1vyyhurdAnXPO7SK0A7ft1iWdIGmWpNmS/ryNcgdJKpJ09vbq9Baec865lEhVC09SAngMOA5YAIyXNNLMZlRQ7h5gVJXiS014zjnn
4i6FLbzewGwzm2Nmm4DhwGkVlLsCeBlYUpX4POE555xLCWlHbhogaULSbUBSVa2A75K2F4T7kp5LrYAzgMFVjc+7NJ1zzqXEjrSgzGwIMKSSwxU1AssPAX0Y+B8zK1IVLwD0hOeccy4lUjhKcwHQJmm7NbCwXJlewPAw2TUFTpJUaGavVVapJzznnHMpkcKrEsYDHSW1A74HzgN+nlzAzNqVPq/0NPD6tpIdeMJzzjmXIqm68NzMCiVdTjD6MgEMNbPpki4Jj1f5vF0yT3jOOedSIpXXnZvZm8Cb5fZVmOjM7NdVqdMTnnPOuZTwyaOdc87FQsITnnPOuTio5vnOE55zzrnU8C5N55xzsVDN850nPOecc6lR3ZcH8oTnnHMuJap5votfwpNUBEwFagCFwDDgYTMrltQLON/MrowgrieBB8svf1GdffLhTB6+ZyRFxcWcemZvzv9tvzLHx34wjSGPjiIrSyQSCa7+06l061k6OQJFRcVc2P8RmjXP4YFHf5Pp8FNq7NiJ3HHHExQXF3POOccxYMA5ZY6PHDmaJ554GYB69Wpzyy2X0rlz8ntRxFlnXUNeXmP+/vebMxp7qk3+dCbDHn6N4qJi+p1yMKedf0yF5b6Z8S03DvgrV932Kw7p141li1fyt9tfZNXytWRliX6nHsJJ5x6R4egzZ/B9F3PiMT1YunwNvY77U9ThpESWr3he7eSbWXcASc2BF4Ac4GYzmwBMiCIoM7soiuf9sYqKinngzld5ZMgAmufl8Jv+f6XvUfvTrn1eaZleB3ek71H7I4nZXy3khoHP8Y+RW/6w//n8ONq2a8769QVRvISUKSoq4rbbBvPUU7eTl9eEs8++hn79DqZDhz1Ly7Runcdzz91FTk59xoyZwP/+76O89NIDpcefeebftG/fmnXrNkTxElKmuKiYofe/wg2PXEyT5jlc/9uHObDv/rRu12Krci/87Q26HdypdF8ikeBXV5xKu06tyV+/ket+8xBde++z1WN3F8++NIbBw0bx5EOXRh1KylT3QSuxXh7IzJYAA4DLFThK0usAko6UNDm8TZLUINz/J0lTJX0p6e5w3+iwdYikppLmhff3l/R5WMcUSR0l1ZP0Rvj4aZLOraCO/uFzTJN0T0m8ktZJuiN87KeS8ojIjGnf0nrPprRq3YQaNbI59oTujP1gepkydevWomQW8/z8TaX3AZYsWsVHY2dy6pkHZzTudJgy5Wv22qslbdq0oGbNGpx88hG8995nZcr07LkvOTn1AejevTOLFi0rPbZo0TJGjx7P2Wcfn9G402H2jG9p0boJea2akF0jmz7H9mDCuOlblXt7xIf0ProLDXPrl+7LbdqQdp1aA1CnXm1a7ZXHiqWrMxZ7pn30+UxWrFoXdRgplcoVz9Mh1gkPwMzmELwPzcsdGghcFrYG+wL5kk4ETgcONrNuwL3bqf4S4JGwjl4EM4CfACw0s25mdgDwdvIDJO1BsIJvP6A7cJCk08PD9YBPw+ceC/xuB19uyixdvIbmeY1Kt5vn5bB0ydZfTqPfm8q5p97LtZcN5YbbtnTzPXzvSC6/5mSyqvtZ7ipYvHg5LVo0Ld3Oy2vC4sXLKy0/YsQ7HHHEgaXbd975BH/844VkZe36f44rlq6mSdLnonGznK2S1oqlqxk/ZirHnd6n0nqW/LCCeV9/T4f990pXqC4NsnbgFlV8ruIfHB8BD0q6EmhkZoXAscBTZrYBwMxWbKfeT4DrJf0PsJeZ5ROcPzxW0j2S+ppZ+SxxEDDazJaGz/k8UHIiYxPwenh/ItB2h15lCtlWS1NRpgVX4qhjuvCPkX/inod/zZBHRwHw4ZgZ5DauT+f9Wqc9zkwwq9p7AfDpp1MYMeI/DBz4awA++OBzGjfO4YADOqQzxEiVfyuGPfwaP7/0p2QlKv762bihgIeuH8YFV51G3Xq1MxChS5UdWQA2CnE8h1eGpL2BIoIl4vct2W9md0t6AzgJ+FTSsQSJ
saKzsoVs+fFQ+hdqZi9I+gw4GRgl6SIze1/SgWG9d0l6x8xuSw5pG+Futi3frkVU8P8vXDV4AMCDj17KBRf9ZBvV/XjN83JYsnhV6faSxatp2qxhpeV79Nqb729czqqV65kyeR7jRs/g4w9nsqlgM+vXF3DLdS9wy10/r/Tx1VmLFk3LdFEuXryc5s0bb1Vu5sy53HjjIJ544hZyc4P36osv/sv773/O2LETKSjYxLp1Gxg48AHuv//ajMWfSo2b5bA86XOxYulqcpvmlCkzZ+YCHrnpWQDWrl7P5I9nkkhkcdCRXSgsLOLB65/m8ON70vuorpkM3aWAqnkbKtYJT1IzguXhHzUzS/5VLqm9mU0Fpko6FOgMvAPcJOkFM9sgqXHYypsHHAh8DpydVMfewBwz+2t4v6ukmcAKM3tO0jrg1+XC+gx4RFJTYCXQHxhU1deUvIrwioKRaRsyte/+bfhu/jIWLlhBs7yGvPv2ZG69u2zC+u7bZbRu0wRJzJqxgM2FReQ0qsulV53EpVedBMAX47/h+WFjdtlkB9ClS0fmzVvId98tIi+vCW+8MZYHHhhYpszChUu44oq7uPfea2jXrlXp/muvvYBrr70AgM8+m8rQoa/ssskOoP2+bVi0YBlLFi6ncbMcPn53Elfc8ssyZQa9fEPp/b/95UV69tmPg47sgpnx9zv/Qau2eZzc/8hMh+5SQPKEV93UkTSZLZclPAs8WEG5qyUdTdCSmgG8ZWYFkroDEyRtIli64nrgfuCfkn4FvJ9Ux7nALyVtBhYBtxF0Wd4nqRjYDPw++UnN7AdJ1wEfELT23jSzf6XkladQdnaCa68/nat//wTFRcX89PTe7N2hBa/88xMAzvzZoYx+dypv/Xsi2dlZ1KpVg7/c+8tKu/p2ZdnZCW666RIuuuhmioqKOeusY+nYcS9efPEtAPr3P5HHHhvOqlVruPXWx4FgROIrrzwUZdhpkchOcOE1Z3LnH4ZQXGQc/dPetNm7Bf959WMAjjuj8vN2s6bMZdzbE9mzfUv+54JgBOt5F59Ejz77VvqYXdmwQVfQ99B9aZrbgNmfPcrtD45g2D9GRx3WTqref9+q6PyD2z2ks4W3q2lcq3PUIVQbk5Z/FXUI1UafHs9HHUK1kf/tizudrVZvervK3zk5NU/IeHaMYwvPOedcWlTvFp4nPOeccynh5/Ccc87FQnUfpVm9o3POObfL0A78t926pBMkzZI0W9KfKzh+WjiD1WRJEyQdvr06vYXnnHMuRVLThpKUAB4DjiOYoWq8pJHlJtd/DxgZXlLWFfgnweVjaY7OOedc7Emq8m07egOzzWyOmW0ChgOnJRcws3VJE3HUo+JJQcrwhOeccy5FUjZ9dCvgu6TtBeG+ss8mnRFO5vEGsN01xjzhOeecS4kdOYcnaUB47q3kNqBMVVvbqgVnZq+aWWeCSf1v3158fg7POedcSohElcsmT4NYgQVAm6Tt1sDCbdQ1VlJ7SU3NbFll5byF55xzLiVSeA5vPNBRUjtJNYHzgJHlnquDwook9QRqApWvy4W38JxzzqVMamZaMbNCSZcDo4AEMNTMpku6JDw+GDgLOD+cqzgfONe2M1emJzznnHMpkcoLz83sTYIJ+pP3DU66fw/BYtlV5gnPOedcivhcms4552LA59J0zjkXC9V9Lk1PeM4551LEuzSdc87FQFUmhY6SJzznnHMpUYXr6yLlCc8551yK+Dk855xzMeCDVpxzzsWCd2k655yLCW/hOeeci4HqPkpT25lr07mdJmlAuBRI7Pl7sYW/F1v4e5EZ1bv96XYXA7ZfJDb8vdjC34st/L3IAE94zjnnYsETnnPOuVjwhOcywc9NbOHvxRb+Xmzh70UG+KAV55xzseAtPOecc7HgCc8551wseMJzzjkXC57wnHPOxYJPLebSQlJ7YIGZFUg6CugKPGNmq6KMK9Mk1QPyzaxY0j5AZ+AtM9sccWgZJ+k2YBzwsZmt
jzqeqElqBvwOaEvSd7GZ/SaqmHZ3PkrTpYWkyUAvgj/mUcBIoJOZnRRhWBknaSLQF8gFPgUmABvM7BeRBhYBSb8BDgcOBdYSJL+xZvavSAOLiKSPCd6DiUBRyX4zezmyoHZznvBcWkj6wsx6SvojsNHMBkmaZGY9oo4tk5LehyuAOmZ2bxzfh2SSWgA/AwYCuWbWIOKQIiFpspl1jzqOOPFzeC5dNkvqD1wAvB7uqxFhPFGRpEOBXwBvhPtieSpB0pNhq+ZxgvfgbIKWb1y9LilWPR5R84Tn0uVCgq6rO8xsrqR2wHMRxxSFq4HrgFfNbLqkvYEPog0pMk2ABLAKWAEsM7PCSCOK1lUESW+jpLXhbU3UQe3OvEvTuQyQVM8HagQk7Qv8BPgDkDCz1hGH5GIill0rLn0kTQUq/RVlZl0zGE7kwu7M/wPqA3tK6gZcbGaXRhtZ5kn6KcEAniMIujLfJxi0EVuSTiV4PwBGm9nr2yrvdo4nPJdqPw3/vSz899nw318AGzIfTuQeJmjNjAQwsy8lHbHNR+y+TgTGAo+Y2cKog4mapLuBg4Dnw11XSTrczP4cYVi7Ne/SdGkh6SMzO2x7+3Z3kj4zs4OTR2ZK+tLMukUdWxQk7QV0NLN3JdUBss1sbdRxRUHSFKC7mRWH2wlgUtx6QTLJB624dKkn6fCSDUl9gHoRxhOV78LXbpJqShoI/DfqoKIg6XfACODv4a7WwGuRBVQ9NEq6nxNVEHHhXZouXX4LDJVU8ke8CojjDBKXAI8ArYAFwDts6e6Nm8uA3sBnAGb2taTm0YYUqbuASZI+AERwLu+6aEPavXmXpksrSQ0JPmero47FRat8966kbOCLOHfhSWpJcB5PwGdmtijikHZr3sJzaSGpFnAW4TyBkgAws9siDCvjJA0DriqZQ1RSLvBATOdLHCPpeqCOpOOAS4F/RxxTxknqbGYzJfUMdy0I/91D0h5m9kVUse3uvIXn0kLS28Bqtp4n8IHIgopARdOIxXVqMUlZBF3dxxO0aEYBT1rMvoQkDTGzAWFXZnlmZv0yHlRMeMJzaSFpmpkdEHUcUZP0JXCUma0MtxsDY8ysS7SRORc/3qXp0uVjSV3MbGrUgUTsAYL3YgTBBfk/A+6INqTMkvRPM/tZZZMSxPUcnqRzgLfNbK2kG4GewO1mNini0HZb3sJzaSFpBtABmAsUEHRhWZy+3MIuvEMIRqj2I3gP3jOzGVHGlWmSWpjZovAavK2Y2fxMx1QdSJpiZl3Dy3fuAu4HrjezgyMObbflCc+lhX+5BSR9YmaHRh1HlMIfP88Dw83sm6jjqS6SRqveBUw1sxfien43U/zCc5cWZjY/TG75BN1YJbe4eUfSWSoZphpP/QnmEn1H0meSrpa0R9RBVQPfS/o7QTf3m+HIZv9OTiNv4bm0CCfFfQDYA1gC7AX818z2jzSwDJO0lmCGmUJgI1u6dhtGGlhEJB0CnEtwycps4EUzeyLaqKIhqS5wAkHr7uvwmrwuZvZOxKHttjzhubQIRyf2A94Nu22OBvqb2YCIQ3PVgKSjgIeA/cysVrTRRENSe2CBmRWE70dX4JmSazZd6nnCc2khaYKZ9QoTXw8zK5b0uZn1jjq2TKpsZQQzG5vpWKIm6SCC7s2zgHnAcOAlM1sWZVxRkTQZ6EUwOcMoghU1OpmZr4KeJn5ZgkuXVZLqEywH87ykJQTdenHzx6T7tQnmkpxI0PqNBUl3EnRjriRIcoeZ2YJtPyoWis2sUNKZwMNmNkiSX5KQRp7wXLqcRnDO6g8Ea+HlALGaVgzAzE5J3pbUBrg3onCiUgCcaGZfRR1INbNZUn/gfKDkc1Ijwnh2e96l6VwGhaM1p8RxppVyF1r/L9AD+Etc546UtB/BahqfmNmLktoB55rZ3RGHttvyhOdSKhyVWNGHKpajEyUNYsv7kQV0B+aZ2S8jCyoifqG1i5p3abqUMrMGJff9IloAJiTdLyQYhv9R
VMFErGQS8ZOBx83sX5JuiTCeSEmaS8VTre0dQTix4AnPpVPsuw/MbJikmsA+4a5ZUcYTsZILrY8F7vELremVdL82cA7QOKJYYsG7NF3aSPrCzHpuv+TuK7y+ahjBMHwBbYALYnpZgl9ovR2SPjSzw6OOY3flLTyXUuEQ6xKNym1jZq9kOKSoPQAcb2azACTtA7wIHBhpVNH4u5n9qmTDzH6QdC8Qy4SXtAAsBC3dXkCDSoq7FPCE51IteRj+mHLbBsQt4dUoSXYAZvaVpLgOPS8zrZykBPFM/CWSF0MuJFhZ5GcRxRIL3qXpXBpJGkqQ6J8Nd/0SSJjZhdFFlVmSrgOuB+oAG0p2A5uAIWZ2XVSxuXjxhOdSStI12zpuZg9mKpbqIByYcRlwOMGX/Fjgb2ZWEGlgEZB0lye3LcIZaO4tmTtTUi5wrZndGGlguzFPeC6lJN0c3u0EHEQwPyAEXZtjzeyiSAKrBiQ1Blqb2ZSoY4mKpFYEK2eUnk6J4wAeqPiyHR/olV5+Ds+llJndCiDpHaCnma0Nt28BXoowtEhIGg2cSvC3NhlYKmmMmW2zJbw7knQ3cB4wgy3X5BlBqzeOEpJqlbT2JdUBYrlyRKZ4wnPpsifBOZoSmwhmhY+bHDNbI+ki4Ckzu1lSXFt4ZxCsBhC77txKPAe8J+kpgsT/G4JLWFyaeMJz6fIs8LmkVwn+mM8Anok2pEhkh9eb/Qy4IepgIjaHYHJkT3iAmd0b/vg5luD87u1mNirisHZrnvBcWpjZHZLeAvqGuy40szgufXIbwVpnH5nZeEl7A19HHFNUNgCTJb1HUtIzsyujCyk6kuoB75jZ25I6AZ0k1TCzzVHHtrvyQSsubcJJgjua2VOSmgH1zWxu1HG5aEi6oKL9ZhbLbjxJEwl+EOYCnxLMu7rBzH4RaWC7MU94Li3C0Zq9CM7Z7CNpD4LVrQ+LOLSMCmdWeRzIM7MDJHUFTjWzv0QcWiTCgRl7Jl+MH1clIzIlXQHUCbs4fcL1NIrzxK0uvc4gGJ24HsDMFhLPaZOeAK4DNgOElyScF2lEEZF0CsFI1bfD7e6SRm7zQbs3STqUYIHkN8J9fpopjTzhuXTZZEH3gUHp+Yo4qmtmn5fbVxhJJNG7BegNrAIws8lAu+jCidzVBD+GXjWz6eH53Q+iDWn35r8mXLr8M1wKppGk3xEMuX4i4piisExSe7Yk/rOBH6INKTKFZrY6WPS9VGzPqZjZGIL5Zku25wCxHMCTKZ7wXFqY2f2SjgPWEMy6cpOZ/SfisKJwGTAE6Czpe4IJguM6KGGapJ8TXHDdkeDL/eOIY8o4SQ+b2dWS/k3FC8CeGkFYseCDVlxahF2YG82sqGTINfBWXIdch+9HFpAPnGtmz0ccUsaF6+HdABxPcN3ZKIJrzzZGGliGSTrQzCZKOrKi42HLz6WBJzyXFnEfci2pIUHrrhXwL+DdcHsg8KWZnRZheK6aCC/XwcyWRh1LHHjCc2kR9yHXkv4FrAQ+AY4hSPw1gavCwRqxI6kXwTJBbSk7eXTXqGKKgoKTmDcDlxO0dLMIBjINMrPbooxtd+fn8Fy6JA+5/m24L06ft73NrAuApCeBZQTXn62NNqxIPQ/8EZgKFEccS5SuBg4DDiqZiCEcofm4pD+Y2UNRBrc7i9MXkMusq4n3kOvSc5Xhecy5MU92AEvNLM7X3ZU4HzjOzJaV7DCzOZJ+CbwDeMJLE+/SdC4NJBURXnRP0G1Vstq3ADOzhlHFFhVJxwD9gfJzab4SWVARkDTNzA7Y0WNu53kLz6WUD7kOmFki6hiqoQuBzgQrJpR0aRoQq4RH2WWzduSY20newnMp5UOuA+Hq5pUysxWZiqW6kDS15LxmnJVr/Zc5BNQ2sxoZDik2POE5lwaS5hK0XkSwGO7K8H4j4Fszi92UWpKeAB4ysxlRx+Liybs0XVqEM2ncBewH1C7Zb2Z7RxZUBpUkNEmDgZFm9ma4fSLBgp9x
dDhwQfhjoIAt5zNjdVmCi4638FxaSPqQ4Fqjh4BTCM7fyMxujjSwDJM00cwOLLdvgpn1iiqmqEjaq6L9ZjY/07G4ePLVEly61DGz9wiS3HwzuwXoF3FMUVgm6UZJbSXtJekGYHnUQUUhTGxtgH7h/Q34d5DLIP+wuXTZKCkL+FrS5ZLOAJpHHVQE+gPNgFfDW7NwX+yEiwL/D8H1mRCM1nwuuohc3HiXpksLSQcB/yUYpHE7kAPcY2afRRlXVCTVN7N1UccRJUmTgR7AFyVTzEma4ufwXKZ4C8+lhZmNN7N1ZrbAzC4EfgZ0iDquTJPUR9IMYEa43U3S3yIOKyq+KLCLlCc8l1KSGkq6TtKjko5X4HJgNkHSi5uHgJ8Qnrczsy+BIyKNKDrlFwV+l3guCuwi4pcluFR7li2rBFxEMFlwTeD0uK4SYGbflVvluyiqWKLkiwK7qHnCc6nmqwSU9Z2kPoBJqkmwyvd/I44pEpLaAeNKkpykOpLamtm8aCNzceFdmi7VyqwSAMR9lYBL2LIQ7AKgO3BplAFF6CXKLgtUFO5zLiO8hedSrZukNeF9AXXC7biuEtCp/Crvkg4DPooonihlm1np5Mhmtils9TqXEd7CcyllZgkzaxjeGphZdtL9uCU7gEFV3BcHSyWVrpYh6TSCLm/nMsJbeM6lQbjaex+gmaRrkg41BOK6dNAlwPOSHiVo8X9HsBiqcxnhCc+59KgJ1Cf4G2uQtH8NcHYkEUXMzL4BDpFUn2DSizif23UR8JlWnEsjSXv55MgBSbWAs4C2JP3YNrPboorJxYu38JxLg5KV34FHJcV25fdy/gWsBiYSLA/kXEZ5wnMuPZ4N/70/0iiql9ZmdkLUQbj48i5N5zJEUi7QxsymRB1LFCQNAQaZ2dSoY3Hx5AnPuTSSNBo4laA3ZTKwFBhjZtds42G7pXAS7Q6Ar3juIuFdms6lV46ZrZF0EfCUmd0sKZYtPODEqANw8eYXnjuXXtmSWhKsFPF61MFEzCq5OZcR3sJzLr1uA0YBH5rZeEl7A19HHFNU3iBIcAJqA+2AWcD+UQbl4sPP4TnnIiGpJ3CxmV0cdSwuHjzhOZcGkv5kZvdKGkQF3XZmdmUEYVU7kr4ws55Rx+Hiwbs0nUuPiyV9BEyIOpDqotycollAT4JRq85lhCc859JjEMFF5y2BfwAvxnXF9yTJc4oWEpzTezmiWFwMeZemc2kkaS/gvPBWG3iRIPnFdeAKkhoQXH+3LupYXLx4wnMuQyT1AIYCXc0sdksESTqAYMq1xuGuZcAFZjYtuqhcnPh1eM6lkaQakk6R9DzwFvAVwYoBcTQEuMbM9jKzvYBrw33OZYSfw3MuDSQdB/QHTgY+B4YDA8xsfaSBRauemX1QsmFmoyXVizIgFy/epelcGkj6AHgBeNnMVkQdT3Ug6VXgC7asJPFLoJeZnR5ZUC5WPOE55zIiXC3iVuBwgtlWxgK3mNnKSANzseEJzznnXCz4OTznXEZI2gcYCLQl6bvHzPpFFZOLF2/hOecyQtKXwGBgIlBUst/MJkYWlIsVT3jOuYyQNNHMDow6DhdfnvCccxkh6RZgCfAqwYrnAPgoVpcpnvCccxkhaW4Fu83M9s54MC6WPOE555yLBR+l6ZzLGEl92HqU5jORBeRixROecy4jJD0LtAcms2WUpgGe8FxGeJemcy4jJP0X2M/8S8dFxFdLcM5lyjSgRdRBuPjyLk3nXKY0BWZI+pwtlyWYmZ0WYUwuRrxL0zmXEZKOTN4kmES6v5ntH1FILma8S9M5lxFmNgZYTbBG4NPAMQRTjTmXEd6l6ZxLq3DS6PMIFsRdDvyDoHfp6EgDc7HjXZrOubSSVAyMA35rZrPDfXN8hhWXad6l6ZxLt7OARcAHkp6QdAzBOTznMspbeM65jJBUDzidoGuzHzAMeNXM3okyLhcfnvCccxknqTFwDnCuLwDrMsUTnnPOuVjwc3jOOediwROec865
WPCE55xzLhY84TnnnIuF/wfZJ5z/k/Wf8wAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 432x288 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "#绘制热力图\n",
    "sns.heatmap(corrDF,annot=True,cmap='YlGnBu')  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 258,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      M\n",
       "1      M\n",
       "2      M\n",
       "3      M\n",
       "4      M\n",
       "      ..\n",
       "475    F\n",
       "476    F\n",
       "477    F\n",
       "478    F\n",
       "479    F\n",
       "Name: Gender, Length: 480, dtype: object"
      ]
     },
     "execution_count": 258,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['Gender']         #查看Gender这一列的详细信息"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 259,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "M    305\n",
       "F    175\n",
       "Name: Gender, dtype: int64"
      ]
     },
     "execution_count": 259,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['Gender'].value_counts()    #查看Gender（性别）中不同值的个数分别是多少"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 260,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>F</th>\n",
       "      <th>M</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>475</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>476</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>477</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>478</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>479</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>480 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     F  M\n",
       "0    0  1\n",
       "1    0  1\n",
       "2    0  1\n",
       "3    0  1\n",
       "4    0  1\n",
       "..  .. ..\n",
       "475  1  0\n",
       "476  1  0\n",
       "477  1  0\n",
       "478  1  0\n",
       "479  1  0\n",
       "\n",
       "[480 rows x 2 columns]"
      ]
     },
     "execution_count": 260,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#独热编码\n",
    "pd.get_dummies(df['Gender'])  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 261,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      M\n",
       "1      M\n",
       "2      M\n",
       "3      M\n",
       "4      M\n",
       "      ..\n",
       "475    F\n",
       "476    F\n",
       "477    F\n",
       "478    F\n",
       "479    F\n",
       "Name: Gender, Length: 480, dtype: object"
      ]
     },
     "execution_count": 261,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['Gender']   #查看性别独热编码后的数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 262,
   "metadata": {},
   "outputs": [],
   "source": [
    "#将object类型的列重定义为category\n",
    "df['Gender']=df['Gender'].astype('category')\n",
    "df['Nationality']=df['Nationality'].astype('category')\n",
    "df['PlaceofBirth']=df['PlaceofBirth'].astype('category')\n",
    "df['StageID']=df['StageID'].astype('category')\n",
    "df['GradeID']=df['GradeID'].astype('category')\n",
    "df['SectionID']=df['SectionID'].astype('category')\n",
    "df['Topic']=df['Topic'].astype('category')\n",
    "df['Semester']=df['Semester'].astype('category')\n",
    "df['Relation']=df['Relation'].astype('category')\n",
    "df['ParentAnsweringSurvey']=df['ParentAnsweringSurvey'].astype('category')\n",
    "df['ParentschoolSatisfaction']=df['ParentschoolSatisfaction'].astype('category')\n",
    "df['StudentAbsenceDays']=df['StudentAbsenceDays'].astype('category')\n",
    "df['Class']=df['Class'].astype('category')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 263,
   "metadata": {},
   "outputs": [],
   "source": [
    "#独热编码\n",
    "df['Gender']=df['Gender'].cat.codes\n",
    "df['Nationality']=df['Nationality'].cat.codes\n",
    "df['PlaceofBirth']=df['PlaceofBirth'].cat.codes\n",
    "df['StageID']=df['StageID'].cat.codes\n",
    "df['GradeID']=df['GradeID'].cat.codes\n",
    "df['SectionID']=df['SectionID'].cat.codes\n",
    "df['Topic']=df['Topic'].cat.codes\n",
    "df['Semester']=df['Semester'].cat.codes\n",
    "df['Relation']=df['Relation'].cat.codes\n",
    "df['ParentAnsweringSurvey']=df['ParentAnsweringSurvey'].cat.codes\n",
    "df['ParentschoolSatisfaction']=df['ParentschoolSatisfaction'].cat.codes\n",
    "df['StudentAbsenceDays']=df['StudentAbsenceDays'].cat.codes\n",
    "df['Class']=df['Class'].cat.codes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 264,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Gender                       int8\n",
       "Nationality                  int8\n",
       "PlaceofBirth                 int8\n",
       "StageID                      int8\n",
       "GradeID                      int8\n",
       "SectionID                    int8\n",
       "Topic                        int8\n",
       "Semester                     int8\n",
       "Relation                     int8\n",
       "RaisedHands                 int64\n",
       "VisitedResources            int64\n",
       "AnnouncementsView           int64\n",
       "Discussion                  int64\n",
       "ParentAnsweringSurvey        int8\n",
       "ParentschoolSatisfaction     int8\n",
       "StudentAbsenceDays           int8\n",
       "Class                        int8\n",
       "dtype: object"
      ]
     },
     "execution_count": 264,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.dtypes      # check the column dtypes after label encoding"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 265,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Gender</th>\n",
       "      <th>Nationality</th>\n",
       "      <th>PlaceofBirth</th>\n",
       "      <th>StageID</th>\n",
       "      <th>GradeID</th>\n",
       "      <th>SectionID</th>\n",
       "      <th>Topic</th>\n",
       "      <th>Semester</th>\n",
       "      <th>Relation</th>\n",
       "      <th>RaisedHands</th>\n",
       "      <th>VisitedResources</th>\n",
       "      <th>AnnouncementsView</th>\n",
       "      <th>Discussion</th>\n",
       "      <th>ParentAnsweringSurvey</th>\n",
       "      <th>ParentschoolSatisfaction</th>\n",
       "      <th>StudentAbsenceDays</th>\n",
       "      <th>Class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>15</td>\n",
       "      <td>16</td>\n",
       "      <td>2</td>\n",
       "      <td>20</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>20</td>\n",
       "      <td>20</td>\n",
       "      <td>3</td>\n",
       "      <td>25</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>10</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>30</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>30</td>\n",
       "      <td>25</td>\n",
       "      <td>5</td>\n",
       "      <td>35</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>40</td>\n",
       "      <td>50</td>\n",
       "      <td>12</td>\n",
       "      <td>50</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>475</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>4</td>\n",
       "      <td>5</td>\n",
       "      <td>8</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>476</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>50</td>\n",
       "      <td>77</td>\n",
       "      <td>14</td>\n",
       "      <td>28</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>477</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>55</td>\n",
       "      <td>74</td>\n",
       "      <td>25</td>\n",
       "      <td>29</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>478</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>6</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>30</td>\n",
       "      <td>17</td>\n",
       "      <td>14</td>\n",
       "      <td>57</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>479</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>6</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>35</td>\n",
       "      <td>14</td>\n",
       "      <td>23</td>\n",
       "      <td>62</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>480 rows × 17 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     Gender  Nationality  PlaceofBirth  StageID  GradeID  SectionID  Topic  \\\n",
       "0         1            4             4        2        1          0      7   \n",
       "1         1            4             4        2        1          0      7   \n",
       "2         1            4             4        2        1          0      7   \n",
       "3         1            4             4        2        1          0      7   \n",
       "4         1            4             4        2        1          0      7   \n",
       "..      ...          ...           ...      ...      ...        ...    ...   \n",
       "475       0            3             3        1        5          0      2   \n",
       "476       0            3             3        1        5          0      5   \n",
       "477       0            3             3        1        5          0      5   \n",
       "478       0            3             3        1        5          0      6   \n",
       "479       0            3             3        1        5          0      6   \n",
       "\n",
       "     Semester  Relation  RaisedHands  VisitedResources  AnnouncementsView  \\\n",
       "0           0         0           15                16                  2   \n",
       "1           0         0           20                20                  3   \n",
       "2           0         0           10                 7                  0   \n",
       "3           0         0           30                25                  5   \n",
       "4           0         0           40                50                 12   \n",
       "..        ...       ...          ...               ...                ...   \n",
       "475         1         0            5                 4                  5   \n",
       "476         0         0           50                77                 14   \n",
       "477         1         0           55                74                 25   \n",
       "478         0         0           30                17                 14   \n",
       "479         1         0           35                14                 23   \n",
       "\n",
       "     Discussion  ParentAnsweringSurvey  ParentschoolSatisfaction  \\\n",
       "0            20                      1                         1   \n",
       "1            25                      1                         1   \n",
       "2            30                      0                         0   \n",
       "3            35                      0                         0   \n",
       "4            50                      0                         0   \n",
       "..          ...                    ...                       ...   \n",
       "475           8                      0                         0   \n",
       "476          28                      0                         0   \n",
       "477          29                      0                         0   \n",
       "478          57                      0                         0   \n",
       "479          62                      0                         0   \n",
       "\n",
       "     StudentAbsenceDays  Class  \n",
       "0                     1      2  \n",
       "1                     1      2  \n",
       "2                     0      1  \n",
       "3                     0      1  \n",
       "4                     0      2  \n",
       "..                  ...    ...  \n",
       "475                   0      1  \n",
       "476                   1      2  \n",
       "477                   1      2  \n",
       "478                   0      1  \n",
       "479                   0      1  \n",
       "\n",
       "[480 rows x 17 columns]"
      ]
     },
     "execution_count": 265,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df            # inspect the data after label encoding"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 266,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Gender</th>\n",
       "      <th>Nationality</th>\n",
       "      <th>PlaceofBirth</th>\n",
       "      <th>StageID</th>\n",
       "      <th>GradeID</th>\n",
       "      <th>SectionID</th>\n",
       "      <th>Topic</th>\n",
       "      <th>Semester</th>\n",
       "      <th>Relation</th>\n",
       "      <th>RaisedHands</th>\n",
       "      <th>VisitedResources</th>\n",
       "      <th>AnnouncementsView</th>\n",
       "      <th>Discussion</th>\n",
       "      <th>ParentAnsweringSurvey</th>\n",
       "      <th>ParentschoolSatisfaction</th>\n",
       "      <th>StudentAbsenceDays</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>15</td>\n",
       "      <td>16</td>\n",
       "      <td>2</td>\n",
       "      <td>20</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>20</td>\n",
       "      <td>20</td>\n",
       "      <td>3</td>\n",
       "      <td>25</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>10</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>30</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>30</td>\n",
       "      <td>25</td>\n",
       "      <td>5</td>\n",
       "      <td>35</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>40</td>\n",
       "      <td>50</td>\n",
       "      <td>12</td>\n",
       "      <td>50</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>475</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>4</td>\n",
       "      <td>5</td>\n",
       "      <td>8</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>476</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>50</td>\n",
       "      <td>77</td>\n",
       "      <td>14</td>\n",
       "      <td>28</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>477</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>55</td>\n",
       "      <td>74</td>\n",
       "      <td>25</td>\n",
       "      <td>29</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>478</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>6</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>30</td>\n",
       "      <td>17</td>\n",
       "      <td>14</td>\n",
       "      <td>57</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>479</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>6</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>35</td>\n",
       "      <td>14</td>\n",
       "      <td>23</td>\n",
       "      <td>62</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>480 rows × 16 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     Gender  Nationality  PlaceofBirth  StageID  GradeID  SectionID  Topic  \\\n",
       "0         1            4             4        2        1          0      7   \n",
       "1         1            4             4        2        1          0      7   \n",
       "2         1            4             4        2        1          0      7   \n",
       "3         1            4             4        2        1          0      7   \n",
       "4         1            4             4        2        1          0      7   \n",
       "..      ...          ...           ...      ...      ...        ...    ...   \n",
       "475       0            3             3        1        5          0      2   \n",
       "476       0            3             3        1        5          0      5   \n",
       "477       0            3             3        1        5          0      5   \n",
       "478       0            3             3        1        5          0      6   \n",
       "479       0            3             3        1        5          0      6   \n",
       "\n",
       "     Semester  Relation  RaisedHands  VisitedResources  AnnouncementsView  \\\n",
       "0           0         0           15                16                  2   \n",
       "1           0         0           20                20                  3   \n",
       "2           0         0           10                 7                  0   \n",
       "3           0         0           30                25                  5   \n",
       "4           0         0           40                50                 12   \n",
       "..        ...       ...          ...               ...                ...   \n",
       "475         1         0            5                 4                  5   \n",
       "476         0         0           50                77                 14   \n",
       "477         1         0           55                74                 25   \n",
       "478         0         0           30                17                 14   \n",
       "479         1         0           35                14                 23   \n",
       "\n",
       "     Discussion  ParentAnsweringSurvey  ParentschoolSatisfaction  \\\n",
       "0            20                      1                         1   \n",
       "1            25                      1                         1   \n",
       "2            30                      0                         0   \n",
       "3            35                      0                         0   \n",
       "4            50                      0                         0   \n",
       "..          ...                    ...                       ...   \n",
       "475           8                      0                         0   \n",
       "476          28                      0                         0   \n",
       "477          29                      0                         0   \n",
       "478          57                      0                         0   \n",
       "479          62                      0                         0   \n",
       "\n",
       "     StudentAbsenceDays  \n",
       "0                     1  \n",
       "1                     1  \n",
       "2                     0  \n",
       "3                     0  \n",
       "4                     0  \n",
       "..                  ...  \n",
       "475                   0  \n",
       "476                   1  \n",
       "477                   1  \n",
       "478                   0  \n",
       "479                   0  \n",
       "\n",
       "[480 rows x 16 columns]"
      ]
     },
     "execution_count": 266,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X=df.drop('Class',axis=1)   # features: every column except Class\n",
    "y=df['Class']               # y holds the labels\n",
    "X"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 267,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(480, 16)\n",
      "(480, 16)\n"
     ]
    }
   ],
   "source": [
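    "print(X.shape)                   # number of features before selection\n",
    "# Feature selection\n",
    "selector = VarianceThreshold()   # default threshold=0: drop zero-variance features first\n",
    "X_feature_selection = selector.fit_transform(X)\n",
    "print(X_feature_selection.shape) # number of features left after the zero-variance filter"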
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
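    "#### The output shows that the data starts with 16 features and still has 16 after variance filtering with a threshold of 0, so no feature has zero variance. Sixteen features is still too many for our purposes, so further selection is needed. Tuning the threshold by hand is awkward: we cannot tell in advance how many features a given threshold will keep, and keeping too many or too few both hurt the result. Instead we can keep a fixed share of the features, for example half of them. `VarianceThreshold` keeps only the features whose variance is strictly greater than `threshold`, so passing the median of the feature variances as `threshold` roughly halves the number of features."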
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 268,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(480, 8)\n"
     ]
    }
   ],
   "source": [
    "# Use the median of the feature variances as the threshold to keep about half of the features\n",
    "selector = VarianceThreshold(np.median(X.var().values))\n",
    "X_feature_selection = selector.fit_transform(X)\n",
    "print(X_feature_selection.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The number of features has now been halved: only eight remain."
   ]
  },
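  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The median-threshold step can also be checked on a small example. The sketch below is only an illustration: the `toy` DataFrame and its column names are made up and are not part of the student dataset. It highlights two details that are easy to miss: `VarianceThreshold` works with the population variance (`ddof=0`) while `DataFrame.var()` defaults to the sample variance (`ddof=1`), and `get_support()` reports which columns survive.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn.feature_selection import VarianceThreshold\n",
    "\n",
    "# Hypothetical toy data standing in for X (not the student dataset)\n",
    "rng = np.random.default_rng(0)\n",
    "toy = pd.DataFrame({\n",
    "    'constant': np.ones(100),                  # zero variance\n",
    "    'binary': rng.integers(0, 2, size=100),    # variance of at most 0.25\n",
    "    'small': rng.integers(0, 5, size=100),     # moderate variance\n",
    "    'large': rng.integers(0, 100, size=100),   # large variance\n",
    "})\n",
    "\n",
    "# Match scikit-learn's definition by asking pandas for the population variance\n",
    "threshold = np.median(toy.var(ddof=0).values)\n",
    "selector = VarianceThreshold(threshold=threshold)\n",
    "reduced = selector.fit_transform(toy)\n",
    "\n",
    "# get_support() returns a boolean mask over the input columns\n",
    "kept = toy.columns[selector.get_support()].tolist()\n",
    "print(reduced.shape)\n",
    "print(kept)\n",
    "```\n",
    "\n",
    "With 480 rows the two variance definitions differ only by a factor of 480/479, so the choice is unlikely to change which features are kept here, but it can matter on small samples."
   ]
  },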
  {
   "cell_type": "code",
   "execution_count": 269,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "219    2\n",
       "183    2\n",
       "453    2\n",
       "380    1\n",
       "308    2\n",
       "Name: Class, dtype: int8"
      ]
     },
     "execution_count": 269,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Split the data into training and test sets; 20% is held out as the test set to evaluate the model\n",
    "X_train,X_test,y_train,y_test=train_test_split(X_feature_selection,y,test_size=0.2,random_state=10)  \n",
    "y_test.head(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 270,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "((384, 8), (96, 8), (384,), (96,))"
      ]
     },
     "execution_count": 270,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_train.shape,X_test.shape,y_train.shape,y_test.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 271,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "d:\\anaconda\\lib\\site-packages\\sklearn\\linear_model\\_logistic.py:814: ConvergenceWarning: lbfgs failed to converge (status=1):\n",
      "STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.\n",
      "\n",
      "Increase the number of iterations (max_iter) or scale the data as shown in:\n",
      "    https://scikit-learn.org/stable/modules/preprocessing.html\n",
      "Please also refer to the documentation for alternative solver options:\n",
      "    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression\n",
      "  n_iter_i = _check_optimize_result(\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "LogisticRegression(random_state=0)"
      ]
     },
     "execution_count": 271,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Train and test the model\n",
    "# 1. Logistic regression model\n",
    "Logit=LogisticRegression(random_state=0)\n",
    "Logit.fit(X_train,y_train)"
   ]
  },
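  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `ConvergenceWarning` in the output above recommends raising `max_iter` or scaling the data. A minimal sketch of doing both is shown below. It assumes synthetic data from `make_classification` as a stand-in for the selected features, all `*_demo` names are hypothetical, and it is not meant to reproduce the accuracy reported in this notebook.\n",
    "\n",
    "```python\n",
    "from sklearn.datasets import make_classification\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.metrics import accuracy_score\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.pipeline import make_pipeline\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "\n",
    "# Synthetic 3-class data with 8 features (an assumption, not the student dataset)\n",
    "X_demo, y_demo = make_classification(\n",
    "    n_samples=480, n_features=8, n_informative=5, n_classes=3, random_state=0\n",
    ")\n",
    "X_tr, X_te, y_tr, y_te = train_test_split(\n",
    "    X_demo, y_demo, test_size=0.2, random_state=10\n",
    ")\n",
    "\n",
    "# Standardize the features, then fit logistic regression with a higher iteration cap\n",
    "clf_demo = make_pipeline(\n",
    "    StandardScaler(),\n",
    "    LogisticRegression(max_iter=1000, random_state=0),\n",
    ")\n",
    "clf_demo.fit(X_tr, y_tr)\n",
    "\n",
    "pred_demo = clf_demo.predict(X_te)\n",
    "acc_demo = accuracy_score(y_te, pred_demo)\n",
    "print('accuracy', acc_demo)\n",
    "```\n",
    "\n",
    "Putting the scaler inside the pipeline means it is fitted on the training split only, so no information from the test split leaks into training."
   ]
  },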
  {
   "cell_type": "code",
   "execution_count": 272,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Predict [2 1 2 0 0 0 0 0 2 1 0 1 2 2 2 0 1 0 0 0 1 2 2 2 2 1 0 2 2 0 0 0 1 2 0 1 2\n",
      " 0 1 1 2 0 0 2 2 0 0 2 1 0 0 1 2 0 2 0 0 1 2 0 0 1 2 0 2 0 2 0 2 0 2 1 1 2\n",
      " 2 2 2 1 0 0 1 0 1 1 2 2 1 2 0 0 2 1 2 2 2 0]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "(96,)"
      ]
     },
     "execution_count": 272,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 2. Predict on the test set\n",
    "Predict=Logit.predict(X_test)\n",
    "print('Predict',Predict)\n",
    "Predict.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 273,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "219    2\n",
       "183    2\n",
       "453    2\n",
       "380    1\n",
       "308    2\n",
       "      ..\n",
       "242    1\n",
       "361    2\n",
       "190    1\n",
       "68     0\n",
       "131    2\n",
       "Name: Class, Length: 96, dtype: int8"
      ]
     },
     "execution_count": 273,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y_test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 274,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 3. Evaluate the model\n",
    "Score=accuracy_score(y_test,Predict)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 275,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "accuracy 0.6354166666666666\n"
     ]
    }
   ],
   "source": [
    "print('accuracy',Score)        # the logistic regression model reaches a test accuracy of about 0.64"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.9.12 ('base')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "5179d32cf6ec497baf3f8a3ef987cc77c5d2dc691fdde20a56316522f61a7323"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
